Poster: brewster (Administrator, Curator, or Staff) Date: Oct 24, 2004 2:06am
Forum: petabox Subject: about 400TB of this design shipped to the Archive

The Internet Archive built 160TB of machines to test the design, and now Capricorn Technologies has just finished shipping the next 250TB of machines.

The Internet Archive ordered another 500TB of systems that should be delivered by early December.

Therefore we are on track for a petabyte this year, and ready to crank even more out.

Issues coming up revolve around the VIA boards, RAM, and crating for international shipment. VIA boards are getting more expensive, not less, and they have occasional problems with drives on the C and D IDE channels. We are getting some RAM batches that fail RAM tests, which could be a problem with the RAM or with the VIA boards. Crating of one rack, in combination with shipper errors, led to it falling over and damaging the rack and some of the node cases.

We really want a micro-ATX motherboard that has gigabit, low power, and 4 SATA ports. Alas, we cannot find such a beast yet.

-brewster


Poster: matt-genesi-usa Date: Jun 23, 2005 10:08am
Forum: petabox Subject: Re: about 400TB of this design shipped to the Archive

Quick plug;

http://www.pegasosppc.com/pegasos.php

Dual gigabit ethernet, fully supported under 2.6.12, could even cluster over Firewire (that's a fun trick :)

We don't have SATA though on the current model. And unfortunately we are using a CPU module which means each case requires 2U - but this helps because you can throw a riser card in there with a cheap SATA card on it.

I'd love to chat about this to see if we could make some improvements to the product and get it used in such a fashion. We also have other products in the pipe which may be more suitable, but that's not stuff for forums.

matt at genesi-usa.com


Poster: dunno Date: Mar 3, 2005 6:43am
Forum: petabox Subject: Re: about 400TB of this design shipped to the Archive

about the "We really want a micro-atx motherboard that has gigabit, low power, 4 sata ports. alas, we can not find such a beast yet."
specifically the gigabit part.

how many nodes are you thinking of, and then how large is your LAN pipe, and how large is your WAN pipe... because you have a lot of nodes, and gigabit switches aren't cheap. what would you do with the bandwidth?


Poster: brewster (Administrator, Curator, or Staff) Date: Mar 3, 2005 1:06pm
Forum: petabox Subject: Re: about 400TB of this design shipped to the Archive

Our WAN bandwidth is 1 Gbit. The cost of gigabit switches is reasonable at this point, for us.

With a 1.6TB node, it takes roughly 1.6 days to copy it, which is a very long time. It would be nice to have that down to several hours.

-brewster


Poster: jko Date: Mar 31, 2005 1:41pm
Forum: petabox Subject: Re: about 400TB of this design shipped to the Archive

Isn't that .16 days to transfer 1.6TB, assuming you get the full (unrealistic) 1 Gbps? Even at half that, it's still less than 10 hrs.


Poster: James Day Date: Apr 1, 2005 6:47pm
Forum: petabox Subject: Re: about 400TB of this design shipped to the Archive

1.6 days was probably the time at 100 megabits.

In practice, Wikipedia sees closer to 20-40 megabytes per second over local gigabit links. Probably not coincidentally, typical router traffic levels in colocation sites tend to have about a 340 megabits per second traffic limit in practical use.
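The copy times being discussed in this thread can be checked with a few lines of arithmetic. The sketch below is only an illustration, not anything the Archive runs: it assumes decimal terabytes (10^12 bytes), a perfectly sustained rate, and no protocol or disk overhead, and the function name `transfer_hours` is made up for the example. Under those assumptions a 1.6TB node takes about 1.5 days at 100 megabits, roughly 11 to 22 hours at 20-40 megabytes per second, and about 3.6 hours at a full gigabit.

```python
def transfer_hours(size_tb: float, rate_mbit_per_s: float) -> float:
    """Hours to move size_tb terabytes (10**12 bytes each) at a sustained rate.

    Assumes an ideal link: no protocol overhead, no disk bottleneck.
    """
    bits = size_tb * 1e12 * 8
    seconds = bits / (rate_mbit_per_s * 1e6)
    return seconds / 3600


NODE_TB = 1.6  # node size quoted in the thread

# Rates mentioned in the thread, expressed in megabits per second.
rates = {
    "100 Mbit/s": 100,
    "20 MB/s (160 Mbit/s)": 160,
    "40 MB/s (320 Mbit/s)": 320,
    "half gigabit (500 Mbit/s)": 500,
    "full gigabit (1000 Mbit/s)": 1000,
}

for label, rate in rates.items():
    hours = transfer_hours(NODE_TB, rate)
    print(f"{label:>27}: {hours:5.1f} h ({hours / 24:.2f} days)")
```

Real copies will be slower than these figures, since drives, filesystems, and the copy tool all add overhead on top of the network.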

I can definitely see why the Internet Archive wants a faster network connection. Gigabit switches are inexpensive compared to the costs they reduce, notably the cost of getting things done within an easy human attention span or getting a system working again after a problem.

Trying to maintain a high traffic, high availability place can significantly change views of what is optional and what is necessary. Things like power distribution units with meters and alarms so someone is less likely to take out a whole rack or site by overloading a 30 amp circuit. You haven't lived until something avoidable like that has taken a popular site down for half a day because you didn't spend (or didn't have available to spend) $350 or so for a metered PDU and your colo wasn't watching.

Brak's earlier comment about costs was also spot on. There's a lot more to factor in than the obvious bits when you have to be reliable and need to price the whole system.


Poster: indianews Date: Jan 21, 2010 7:27am
Forum: petabox Subject: Re: about 400TB of this design shipped to the Archive

your information was very helpful. Thanks


Poster: dunno Date: Jun 11, 2005 7:45am
Forum: petabox Subject: Re: about 400TB of this design shipped to the Archive

The move from the low-power VIA Eden processors to the expensive, 68.5W max TDP P4 is a bit puzzling. I would expect you to look for a MIPS-based board, a G4 (ppc6xx?), a mobile Pentium (Banias/Dothan), a Transmeta, or a P3. You chose instead a desktop P4, or rather 40 desktop P4s per 42U rack.

Now, I suppose you were locked in to a micro-ITX board because of your custom case, but I am under the impression that Socket 370 and Socket 479 (P3 and Banias/Dothan) boards are available in that format. Perhaps they were more expensive (the Banias/Dothan boards) or not available with gigabit (P3?).

Anyway, I can't imagine it would be easy to run a P4 2.8 in a 1U case, much less 40 in a 42U rack, and I would imagine it would drastically increase the HVAC burden and the kWh used.
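To put a rough number on that concern, here is a small sketch using only the two figures quoted in this post (68.5W max TDP and 40 nodes per rack). It treats TDP as a worst-case upper bound for the CPUs alone, which is an assumption: it is not measured draw, and it leaves out drives, motherboards, power supply losses, and cooling. The function names are invented for the example.

```python
CPU_TDP_W = 68.5      # max TDP per P4, as quoted in the post
NODES_PER_RACK = 40   # nodes per 42U rack, as quoted in the post


def rack_cpu_watts(tdp_w: float, nodes: int) -> float:
    """Worst-case CPU-only power for one rack, in watts."""
    return tdp_w * nodes


def kwh_per_day(watts: float) -> float:
    """Energy used in 24 hours at a constant draw, in kilowatt-hours."""
    return watts * 24 / 1000


watts = rack_cpu_watts(CPU_TDP_W, NODES_PER_RACK)
print(f"CPU power per rack (upper bound): {watts:.0f} W")
print(f"CPU energy per rack per day:      {kwh_per_day(watts):.2f} kWh")
```

That works out to about 2.7 kW of CPU heat per rack at full load, before counting anything else in the case.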


Poster: James Day Date: Jun 11, 2005 12:27pm
Forum: petabox Subject: Re: about 400TB of this design shipped to the Archive

I'm a Wikipedia person so I can't really comment much on that petabox CPU choice. Maybe brewster will.

On the Wikipedia side, we've recently started storing three identical copies of our old versions of article text on our P4 Apache web servers/page builders, to use their hard drives for something useful. The Petabox might have some similar processing power need, or they might just not have been able to get what they wanted from the available alternatives. Wikipedia hasn't yet given much thought to kWh and HVAC loads, perhaps mostly because they are just part of the package at our current hosting place.


Poster: viswiss Date: Jan 23, 2010 7:55am
Forum: petabox Subject: Re: about 400TB of this design shipped to the Archive


Your post is really informative for me. I liked it very much.
Keep sharing such important posts.


Poster: Curator at the Security Digest Archives Date: Feb 17, 2005 8:02am
Forum: petabox Subject: Re: about 400TB of this design shipped to the Archive

You need some custom hardware, it seems to me: i.e. a DIMM-PC or gumstix style form factor. Essentially, it needs to be the same size as the 3.5" form factor so that it can fit snugly onto the back of your IDE drive. The hardware needs only a 200+ MHz processor, 64 MB RAM, Ethernet, and an IDE controller; it's not a tough call to do this on that size of PCB. The hardware could run uLinux or NetBSD to export CIFS, NFS, ATA over Ethernet, or whatever you require: effectively a DIMM-size SAN. This type of hardware would be low power, give off little heat, and be cheap and reliable. Not only that, I'm sure if you over-produced and sold some of them off, you'd be able to recover some cost. The VIA boards are great, but the real estate, MTBF, and power drain would really have to add up at petabox scale.


Poster: Brak (Administrator, Curator, or Staff) Date: Feb 17, 2005 1:23pm
Forum: petabox Subject: Re: about 400TB of this design shipped to the Archive


You make a good point and I've often thought of something similar but it's not cost effective yet.

A good way to look at the equation is the delta per block device. I'm being vague about "block device" since the sizes and types are many. The services you need to provide are access and power at a minimum, and perhaps processing (more on that later).

Access ~ Megabits/Block Device, switching (level of oversubscription on uplinks, if any), cabling, etc.

Power ~ Cabling, Power supplies, etc..

Cooling ~ putting thousands of these next to each other is more challenging than putting one on your desk.

A desktop PC version could be as cheap as a Sempron board with CPU, a desktop power supply, and a Linksys switch.

Stick it into a spreadsheet, play around..

The upshot in all this is processing. Once the scale is hundreds of nodes, and thus thousands of disks, the incremental cost, if any, of real CPUs, when compared to other models, can be quite compelling.

Once you have all the variables plugged in, you'll notice the costs of things like having four 100Mb ports vs. one 1000Mb port. You'll also see how choosing between 4 ARM CPUs, 1 VIA, or a Xeon affects your monthly power bill (don't forget about the AC).

Even after all of that, some kind of erector set needs to suspend all this stuff (maybe in cases, maybe not) in some kind of room... keeping everything cool.
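For anyone who would rather play with this in code than in a spreadsheet, here is one possible way to lay out the per-block-device comparison. It is a minimal sketch, not the Archive's actual model: the `NodeDesign` fields, the cooling multiplier, the electricity price, and every number in the two example designs are hypothetical placeholders chosen only to show the shape of the calculation. Plug in real quotes before drawing any conclusions.

```python
from dataclasses import dataclass


@dataclass
class NodeDesign:
    name: str
    hardware_cost: float  # purchase price per node, dollars (placeholder)
    watts: float          # average draw per node (placeholder)
    disks: int            # block devices served per node
    mbit_per_s: float     # total network bandwidth per node


def monthly_power_cost(design: NodeDesign, dollars_per_kwh: float,
                       cooling_overhead: float) -> float:
    """Power bill per node for a 30-day month.

    cooling_overhead is extra energy for AC as a fraction of the node's
    own draw (0.5 means 50% on top), a simplifying assumption.
    """
    kwh = design.watts * 24 * 30 / 1000
    return kwh * (1 + cooling_overhead) * dollars_per_kwh


def per_disk(design: NodeDesign, months: int,
             dollars_per_kwh: float = 0.10,
             cooling_overhead: float = 0.5) -> dict:
    """Hardware plus power cost, and bandwidth, per block device."""
    power = months * monthly_power_cost(design, dollars_per_kwh, cooling_overhead)
    total = design.hardware_cost + power
    return {
        "cost_per_disk": total / design.disks,
        "mbit_per_disk": design.mbit_per_s / design.disks,
    }


# Hypothetical designs; none of these figures come from the thread.
designs = [
    NodeDesign("small board, 100Mb port", 300.0, 60.0, 4, 100.0),
    NodeDesign("desktop CPU, gigabit port", 450.0, 150.0, 4, 1000.0),
]

for d in designs:
    result = per_disk(d, months=36)
    print(f"{d.name}: ${result['cost_per_disk']:.2f} per disk over 36 months, "
          f"{result['mbit_per_disk']:.0f} Mbit/s per disk")
```

Switching, cabling, racks, and labor are left out here; they would be additional per-node or per-rack terms added in the same way.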


Poster: foundation Date: Nov 10, 2004 11:58pm
Forum: petabox Subject: Re: about 400TB of this design shipped to the Archive

One option, certainly more expensive than a micro-ATX board (by an order of magnitude), would be an embedded board such as a CompactPCI board (like http://www.pt.com/products/prod_zt5515.html) with a 4-port Serial ATA PMC add-on board. The catch is that most telecom solutions aren't oriented around storage, so mounting and powering the hard drives would be the trick.

A more relevant solution might be the new Pentium-M desktop boards that DFI is releasing. The site is down right now, so the specs are guesses, but I think they have GbE onboard and a couple of Serial ATA ports (maybe not 4, but an add-on card would solve that).



Poster: indianews Date: Jan 21, 2010 7:28am
Forum: petabox Subject: Re: about 400TB of this design shipped to the Archive

Nice post with good information.